{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "23095649",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/llm/cohere.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9e3a8796-edc8-43f2-94ad-fe4fb20d70ed",
   "metadata": {},
   "source": [
    "# Cohere"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b007403c-6b7a-420c-92f1-4171d05ed9bb",
   "metadata": {},
   "source": [
    "## Basic Usage"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8ead155e-b8bd-46f9-ab9b-28fc009361dd",
   "metadata": {},
   "source": [
    "#### Call `complete` with a prompt"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "1f1b7133",
   "metadata": {},
   "source": [
    "If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f7b27c9f",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-llms-openai\n",
    "%pip install llama-index-llms-cohere"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "afb8de1e",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "60be18ae-c957-4ac2-a58a-0652e18ee6d6",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.\n"
     ]
    }
   ],
   "source": [
    "from llama_index.llms.cohere import Cohere\n",
    "\n",
    "api_key = \"Your api key\"\n",
    "resp = Cohere(api_key=api_key).complete(\"Paul Graham is \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ac2cbebb-a444-4a46-9d85-b265a3483d68",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "an English computer scientist, entrepreneur and investor. He is best known for his work as a co-founder of the seed accelerator Y Combinator. He is also the author of the free startup advice blog \"Startups.com\". Paul Graham is known for his philanthropic efforts. Has given away hundreds of millions of dollars to good causes.\n"
     ]
    }
   ],
   "source": [
    "print(resp)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14831268-f90f-499d-9d86-925dbc88292b",
   "metadata": {},
   "source": [
    "#### Call `chat` with a list of messages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bbe29574-4af1-48d5-9739-f60652b6ce6c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.llms import ChatMessage\n",
    "from llama_index.llms.cohere import Cohere\n",
    "\n",
    "messages = [\n",
    "    ChatMessage(role=\"user\", content=\"hello there\"),\n",
    "    ChatMessage(\n",
    "        role=\"assistant\", content=\"Arrrr, matey! How can I help ye today?\"\n",
    "    ),\n",
    "    ChatMessage(role=\"user\", content=\"What is your name\"),\n",
    "]\n",
    "\n",
    "resp = Cohere(api_key=api_key).chat(\n",
    "    messages, preamble_override=\"You are a pirate with a colorful personality\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9cbd550a-0264-4a11-9b2c-a08d8723a5ae",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "assistant: Traditionally, ye refers to gender-nonconforming people of any gender, and those who are genderless, whereas matey refers to a friend, commonly used to address a fellow pirate. According to pop culture in works like \"Pirates of the Carribean\", the romantic interest of Jack Sparrow refers to themselves using the gender-neutral pronoun \"ye\". \n",
      "\n",
      "Are you interested in learning more about the pirate culture?\n"
     ]
    }
   ],
   "source": [
    "print(resp)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2ed5e894-4597-4911-a623-591560f72b82",
   "metadata": {},
   "source": [
    "## Streaming"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4cb7986f-aaed-42e2-abdd-f274f6d4fc59",
   "metadata": {},
   "source": [
    "Using `stream_complete` endpoint "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d43f17a2-0aeb-464b-a7a7-732ba5e8ef24",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms.openai import OpenAI\n",
    "\n",
    "llm = Cohere(api_key=api_key)\n",
    "resp = llm.stream_complete(\"Paul Graham is \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0214e911-cf0d-489c-bc48-9bb1d8bf65d8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " an English computer scientist, essayist, and venture capitalist. He is best known for his work as a co-founder of the Y Combinator startup incubator, and his essays, which are widely read and influential in the startup community. "
     ]
    }
   ],
   "source": [
    "for r in resp:\n",
    "    print(r.delta, end=\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "40350dd8-3f50-4a2f-8545-5723942039bb",
   "metadata": {},
   "source": [
    "Using `stream_chat` endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bc636e65-a67b-4dcd-ac60-b25abc9d8dbd",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms.openai import OpenAI\n",
    "\n",
    "llm = Cohere(api_key=api_key)\n",
    "messages = [\n",
    "    ChatMessage(role=\"user\", content=\"hello there\"),\n",
    "    ChatMessage(\n",
    "        role=\"assistant\", content=\"Arrrr, matey! How can I help ye today?\"\n",
    "    ),\n",
    "    ChatMessage(role=\"user\", content=\"What is your name\"),\n",
    "]\n",
    "resp = llm.stream_chat(\n",
    "    messages, preamble_override=\"You are a pirate with a colorful personality\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4475a6bc-1051-4287-abce-ba83324aeb9e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Arrrr, matey! According to etiquette, we are suppose to exchange names first! Mine remains a mystery for now."
     ]
    }
   ],
   "source": [
    "for r in resp:\n",
    "    print(r.delta, end=\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "009d3f1c-ef35-4126-ae82-0b97adb746e3",
   "metadata": {},
   "source": [
    "## Configure Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e973e3d1-a3c9-43b9-bee1-af3e57946ac3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms.cohere import Cohere\n",
    "\n",
    "llm = Cohere(model=\"command\", api_key=api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2c9bcf6-c950-4dfc-abdc-598d5bdedf40",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.\n"
     ]
    }
   ],
   "source": [
    "resp = llm.complete(\"Paul Graham is \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2edc85ca-df17-4774-a3ea-e80109fa1811",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "an English computer scientist, entrepreneur and investor. He is best known for his work as a co-founder of the seed accelerator Y Combinator. He is also the co-founder of the online dating platform Match.com. \n"
     ]
    }
   ],
   "source": [
    "print(resp)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df5fa1ab-f598-46da-80f3-f6af5dd2fe83",
   "metadata": {},
   "source": [
    "## Async"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3e088b90-12b6-4211-a9ca-696e9897e9ae",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms.cohere import Cohere\n",
    "\n",
    "llm = Cohere(model=\"command\", api_key=api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8f649683-896a-439e-b14b-e5df5d803b82",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.\n"
     ]
    }
   ],
   "source": [
    "resp = await llm.acomplete(\"Paul Graham is \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d05d8b7c-07b3-4ce8-9a8c-fcd1e830316c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "an English computer scientist, entrepreneur and investor. He is best known for his work as a co-founder of the startup incubator and seed fund Y Combinator, and the programming language Lisp. He has also written numerous essays, many of which have become highly influential in the software engineering field. \n"
     ]
    }
   ],
   "source": [
    "print(resp)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "abc72d09-3bcd-4d48-9bb7-0920c56310c1",
   "metadata": {},
   "outputs": [],
   "source": [
    "resp = await llm.astream_complete(\"Paul Graham is \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b28d7a06-1518-4ec0-b3cc-6364395b3561",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " an English computer scientist, essayist, and businessman. He is best known for his work as a co-founder of the startup accelerator Y Combinator, and his essay \"Beating the Averages.\" "
     ]
    }
   ],
   "source": [
    "async for delta in resp:\n",
    "    print(delta.delta, end=\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a2782f06",
   "metadata": {},
   "source": [
    "## Set API Key at a per-instance level\n",
    "If desired, you can have separate LLM instances use separate API keys."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "015c2d39",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "an English computer scientist, entrepreneur and investor. He is best known for his work as a co-founder of the acceleration program Y Combinator. He has also written extensively on the topics of computer science and entrepreneurship. Where did you come across his name? \n"
     ]
    },
    {
     "ename": "CohereAPIError",
     "evalue": "invalid api token",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mCohereAPIError\u001b[0m                            Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[17], line 9\u001b[0m\n\u001b[1;32m      6\u001b[0m resp \u001b[38;5;241m=\u001b[39m llm_good\u001b[38;5;241m.\u001b[39mcomplete(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mPaul Graham is \u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m      7\u001b[0m \u001b[38;5;28mprint\u001b[39m(resp)\n\u001b[0;32m----> 9\u001b[0m resp \u001b[38;5;241m=\u001b[39m \u001b[43mllm_bad\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcomplete\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mPaul Graham is \u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m     10\u001b[0m \u001b[38;5;28mprint\u001b[39m(resp)\n",
      "File \u001b[0;32m/workspaces/llama_index/gllama_index/llms/base.py:277\u001b[0m, in \u001b[0;36mllm_completion_callback.<locals>.wrap.<locals>.wrapped_llm_predict\u001b[0;34m(_self, *args, **kwargs)\u001b[0m\n\u001b[1;32m    267\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m wrapper_logic(_self) \u001b[38;5;28;01mas\u001b[39;00m callback_manager:\n\u001b[1;32m    268\u001b[0m     event_id \u001b[38;5;241m=\u001b[39m callback_manager\u001b[38;5;241m.\u001b[39mon_event_start(\n\u001b[1;32m    269\u001b[0m         CBEventType\u001b[38;5;241m.\u001b[39mLLM,\n\u001b[1;32m    270\u001b[0m         payload\u001b[38;5;241m=\u001b[39m{\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    274\u001b[0m         },\n\u001b[1;32m    275\u001b[0m     )\n\u001b[0;32m--> 277\u001b[0m     f_return_val \u001b[38;5;241m=\u001b[39m \u001b[43mf\u001b[49m\u001b[43m(\u001b[49m\u001b[43m_self\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    278\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(f_return_val, Generator):\n\u001b[1;32m    279\u001b[0m         \u001b[38;5;66;03m# intercept the generator and add a callback to the end\u001b[39;00m\n\u001b[1;32m    280\u001b[0m         \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mwrapped_gen\u001b[39m() \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m CompletionResponseGen:\n",
      "File \u001b[0;32m/workspaces/llama_index/gllama_index/llms/cohere.py:139\u001b[0m, in \u001b[0;36mCohere.complete\u001b[0;34m(self, prompt, **kwargs)\u001b[0m\n\u001b[1;32m    136\u001b[0m \u001b[38;5;129m@llm_completion_callback\u001b[39m()\n\u001b[1;32m    137\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mcomplete\u001b[39m(\u001b[38;5;28mself\u001b[39m, prompt: \u001b[38;5;28mstr\u001b[39m, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs: Any) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m CompletionResponse:\n\u001b[1;32m    138\u001b[0m     all_kwargs \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_get_all_kwargs(\u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n\u001b[0;32m--> 139\u001b[0m     response \u001b[38;5;241m=\u001b[39m \u001b[43mcompletion_with_retry\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m    140\u001b[0m \u001b[43m        \u001b[49m\u001b[43mclient\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_client\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    141\u001b[0m \u001b[43m        \u001b[49m\u001b[43mmax_retries\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmax_retries\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    142\u001b[0m \u001b[43m        \u001b[49m\u001b[43mchat\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mFalse\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m    143\u001b[0m \u001b[43m        \u001b[49m\u001b[43mprompt\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mprompt\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    144\u001b[0m \u001b[43m        \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mall_kwargs\u001b[49m\n\u001b[1;32m    145\u001b[0m \u001b[43m    \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    147\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m CompletionResponse(\n\u001b[1;32m    148\u001b[0m         text\u001b[38;5;241m=\u001b[39mresponse\u001b[38;5;241m.\u001b[39mgenerations[\u001b[38;5;241m0\u001b[39m]\u001b[38;5;241m.\u001b[39mtext,\n\u001b[1;32m    149\u001b[0m         raw\u001b[38;5;241m=\u001b[39mresponse\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__dict__\u001b[39m,\n\u001b[1;32m    150\u001b[0m     )\n",
      "File \u001b[0;32m/workspaces/llama_index/gllama_index/llms/cohere_utils.py:74\u001b[0m, in \u001b[0;36mcompletion_with_retry\u001b[0;34m(client, max_retries, chat, **kwargs)\u001b[0m\n\u001b[1;32m     71\u001b[0m     \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m     72\u001b[0m         \u001b[38;5;28;01mreturn\u001b[39;00m client\u001b[38;5;241m.\u001b[39mgenerate(\u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n\u001b[0;32m---> 74\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43m_completion_with_retry\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/.local/share/projects/oss/llama_index/.venv/lib/python3.10/site-packages/tenacity/__init__.py:289\u001b[0m, in \u001b[0;36mBaseRetrying.wraps.<locals>.wrapped_f\u001b[0;34m(*args, **kw)\u001b[0m\n\u001b[1;32m    287\u001b[0m \u001b[38;5;129m@functools\u001b[39m\u001b[38;5;241m.\u001b[39mwraps(f)\n\u001b[1;32m    288\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mwrapped_f\u001b[39m(\u001b[38;5;241m*\u001b[39margs: t\u001b[38;5;241m.\u001b[39mAny, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkw: t\u001b[38;5;241m.\u001b[39mAny) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m t\u001b[38;5;241m.\u001b[39mAny:\n\u001b[0;32m--> 289\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mf\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkw\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/.local/share/projects/oss/llama_index/.venv/lib/python3.10/site-packages/tenacity/__init__.py:379\u001b[0m, in \u001b[0;36mRetrying.__call__\u001b[0;34m(self, fn, *args, **kwargs)\u001b[0m\n\u001b[1;32m    377\u001b[0m retry_state \u001b[38;5;241m=\u001b[39m RetryCallState(retry_object\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m, fn\u001b[38;5;241m=\u001b[39mfn, args\u001b[38;5;241m=\u001b[39margs, kwargs\u001b[38;5;241m=\u001b[39mkwargs)\n\u001b[1;32m    378\u001b[0m \u001b[38;5;28;01mwhile\u001b[39;00m \u001b[38;5;28;01mTrue\u001b[39;00m:\n\u001b[0;32m--> 379\u001b[0m     do \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43miter\u001b[49m\u001b[43m(\u001b[49m\u001b[43mretry_state\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mretry_state\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    380\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(do, DoAttempt):\n\u001b[1;32m    381\u001b[0m         \u001b[38;5;28;01mtry\u001b[39;00m:\n",
      "File \u001b[0;32m~/.local/share/projects/oss/llama_index/.venv/lib/python3.10/site-packages/tenacity/__init__.py:314\u001b[0m, in \u001b[0;36mBaseRetrying.iter\u001b[0;34m(self, retry_state)\u001b[0m\n\u001b[1;32m    312\u001b[0m is_explicit_retry \u001b[38;5;241m=\u001b[39m fut\u001b[38;5;241m.\u001b[39mfailed \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(fut\u001b[38;5;241m.\u001b[39mexception(), TryAgain)\n\u001b[1;32m    313\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m (is_explicit_retry \u001b[38;5;129;01mor\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mretry(retry_state)):\n\u001b[0;32m--> 314\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfut\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mresult\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    316\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mafter \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m    317\u001b[0m     \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mafter(retry_state)\n",
      "File \u001b[0;32m/usr/lib/python3.10/concurrent/futures/_base.py:449\u001b[0m, in \u001b[0;36mFuture.result\u001b[0;34m(self, timeout)\u001b[0m\n\u001b[1;32m    447\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m CancelledError()\n\u001b[1;32m    448\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_state \u001b[38;5;241m==\u001b[39m FINISHED:\n\u001b[0;32m--> 449\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m__get_result\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    451\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_condition\u001b[38;5;241m.\u001b[39mwait(timeout)\n\u001b[1;32m    453\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_state \u001b[38;5;129;01min\u001b[39;00m [CANCELLED, CANCELLED_AND_NOTIFIED]:\n",
      "File \u001b[0;32m/usr/lib/python3.10/concurrent/futures/_base.py:401\u001b[0m, in \u001b[0;36mFuture.__get_result\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m    399\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_exception:\n\u001b[1;32m    400\u001b[0m     \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 401\u001b[0m         \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_exception\n\u001b[1;32m    402\u001b[0m     \u001b[38;5;28;01mfinally\u001b[39;00m:\n\u001b[1;32m    403\u001b[0m         \u001b[38;5;66;03m# Break a reference cycle with the exception in self._exception\u001b[39;00m\n\u001b[1;32m    404\u001b[0m         \u001b[38;5;28mself\u001b[39m \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m\n",
      "File \u001b[0;32m~/.local/share/projects/oss/llama_index/.venv/lib/python3.10/site-packages/tenacity/__init__.py:382\u001b[0m, in \u001b[0;36mRetrying.__call__\u001b[0;34m(self, fn, *args, **kwargs)\u001b[0m\n\u001b[1;32m    380\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(do, DoAttempt):\n\u001b[1;32m    381\u001b[0m     \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 382\u001b[0m         result \u001b[38;5;241m=\u001b[39m \u001b[43mfn\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    383\u001b[0m     \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mBaseException\u001b[39;00m:  \u001b[38;5;66;03m# noqa: B902\u001b[39;00m\n\u001b[1;32m    384\u001b[0m         retry_state\u001b[38;5;241m.\u001b[39mset_exception(sys\u001b[38;5;241m.\u001b[39mexc_info())  \u001b[38;5;66;03m# type: ignore[arg-type]\u001b[39;00m\n",
      "File \u001b[0;32m/workspaces/llama_index/gllama_index/llms/cohere_utils.py:72\u001b[0m, in \u001b[0;36mcompletion_with_retry.<locals>._completion_with_retry\u001b[0;34m(**kwargs)\u001b[0m\n\u001b[1;32m     70\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m client\u001b[38;5;241m.\u001b[39mchat(\u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n\u001b[1;32m     71\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m---> 72\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mclient\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgenerate\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/.local/share/projects/oss/llama_index/.venv/lib/python3.10/site-packages/cohere/client.py:221\u001b[0m, in \u001b[0;36mClient.generate\u001b[0;34m(self, prompt, prompt_vars, model, preset, num_generations, max_tokens, temperature, k, p, frequency_penalty, presence_penalty, end_sequences, stop_sequences, return_likelihoods, truncate, logit_bias, stream)\u001b[0m\n\u001b[1;32m    164\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"Generate endpoint.\u001b[39;00m\n\u001b[1;32m    165\u001b[0m \u001b[38;5;124;03mSee https://docs.cohere.ai/reference/generate for advanced arguments\u001b[39;00m\n\u001b[1;32m    166\u001b[0m \n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    200\u001b[0m \u001b[38;5;124;03m        >>>     print(token)\u001b[39;00m\n\u001b[1;32m    201\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m    202\u001b[0m json_body \u001b[38;5;241m=\u001b[39m {\n\u001b[1;32m    203\u001b[0m     \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmodel\u001b[39m\u001b[38;5;124m\"\u001b[39m: model,\n\u001b[1;32m    204\u001b[0m     \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mprompt\u001b[39m\u001b[38;5;124m\"\u001b[39m: prompt,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    219\u001b[0m     \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstream\u001b[39m\u001b[38;5;124m\"\u001b[39m: stream,\n\u001b[1;32m    220\u001b[0m }\n\u001b[0;32m--> 221\u001b[0m response \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_request\u001b[49m\u001b[43m(\u001b[49m\u001b[43mcohere\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mGENERATE_URL\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mjson\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mjson_body\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mstream\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mstream\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    222\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m stream:\n\u001b[1;32m    223\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m StreamingGenerations(response)\n",
      "File \u001b[0;32m~/.local/share/projects/oss/llama_index/.venv/lib/python3.10/site-packages/cohere/client.py:927\u001b[0m, in \u001b[0;36mClient._request\u001b[0;34m(self, endpoint, json, files, method, stream, params)\u001b[0m\n\u001b[1;32m    924\u001b[0m     \u001b[38;5;28;01mexcept\u001b[39;00m jsonlib\u001b[38;5;241m.\u001b[39mdecoder\u001b[38;5;241m.\u001b[39mJSONDecodeError:  \u001b[38;5;66;03m# CohereAPIError will capture status\u001b[39;00m\n\u001b[1;32m    925\u001b[0m         \u001b[38;5;28;01mraise\u001b[39;00m CohereAPIError\u001b[38;5;241m.\u001b[39mfrom_response(response, message\u001b[38;5;241m=\u001b[39m\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mFailed to decode json body: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mresponse\u001b[38;5;241m.\u001b[39mtext\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m--> 927\u001b[0m     \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_check_response\u001b[49m\u001b[43m(\u001b[49m\u001b[43mjson_response\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mresponse\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mheaders\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mresponse\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mstatus_code\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    928\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m json_response\n",
      "File \u001b[0;32m~/.local/share/projects/oss/llama_index/.venv/lib/python3.10/site-packages/cohere/client.py:869\u001b[0m, in \u001b[0;36mClient._check_response\u001b[0;34m(self, json_response, headers, status_code)\u001b[0m\n\u001b[1;32m    867\u001b[0m     logger\u001b[38;5;241m.\u001b[39mwarning(headers[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mX-API-Warning\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n\u001b[1;32m    868\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmessage\u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;129;01min\u001b[39;00m json_response:  \u001b[38;5;66;03m# has errors\u001b[39;00m\n\u001b[0;32m--> 869\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m CohereAPIError(\n\u001b[1;32m    870\u001b[0m         message\u001b[38;5;241m=\u001b[39mjson_response[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmessage\u001b[39m\u001b[38;5;124m\"\u001b[39m],\n\u001b[1;32m    871\u001b[0m         http_status\u001b[38;5;241m=\u001b[39mstatus_code,\n\u001b[1;32m    872\u001b[0m         headers\u001b[38;5;241m=\u001b[39mheaders,\n\u001b[1;32m    873\u001b[0m     )\n\u001b[1;32m    874\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;241m400\u001b[39m \u001b[38;5;241m<\u001b[39m\u001b[38;5;241m=\u001b[39m status_code \u001b[38;5;241m<\u001b[39m \u001b[38;5;241m500\u001b[39m:\n\u001b[1;32m    875\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m CohereAPIError(\n\u001b[1;32m    876\u001b[0m         message\u001b[38;5;241m=\u001b[39m\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mUnexpected client error (status \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mstatus_code\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m): \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mjson_response\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m    877\u001b[0m         http_status\u001b[38;5;241m=\u001b[39mstatus_code,\n\u001b[1;32m    878\u001b[0m         headers\u001b[38;5;241m=\u001b[39mheaders,\n\u001b[1;32m    879\u001b[0m     )\n",
      "\u001b[0;31mCohereAPIError\u001b[0m: invalid api token"
     ]
    }
   ],
   "source": [
    "from llama_index.llms.cohere import Cohere\n",
    "\n",
    "llm_good = Cohere(api_key=api_key)\n",
    "llm_bad = Cohere(model=\"command\", api_key=\"BAD_KEY\")\n",
    "\n",
    "resp = llm_good.complete(\"Paul Graham is \")\n",
    "print(resp)\n",
    "\n",
    "resp = llm_bad.complete(\"Paul Graham is \")\n",
    "print(resp)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
